14

The Nature of Living Things

Differences Between Prokaryotes and Eukaryotes (2)

Bacterial genomes consist of blocks of genes preceded by regulatory (promoter)

sequences. Eukaryotic DNA resembles a mosaic of the following: genes (segments

whose sequence codes for amino acids, also called exons, from expressed, or “coding

DNA”);²⁰ segments (called introns) that are transcribed into RNA, but then excised

to leave the final mRNA used as the template for producing the protein; many genes

are split into a dozen or more segments, which can be spliced in different ways to

generate variant proteins after translation; promoters (short regions of DNA to which

RNA, proteins, or small molecules may bind, modulating the attachment of RNA

polymerase to the start of a gene); and intergenomic sequences (the rest, sometimes

called “junk” DNA in the same sense in which untranslated cuneiform tablets may

be called junk—we do not know what they mean). This is schematically illustrated

in Fig. 14.3.
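
The excision of introns and the alternative joining of exons described above can be illustrated with a minimal sketch. The toy transcript, the exon coordinates, and the function names `splice` and `skipping_variants` are assumptions made purely for illustration; real splicing depends on splice-site signals and regulatory factors that are not modelled here, and exon skipping is only one of several modes of alternative splicing.

```python
from itertools import combinations


def splice(pre_mrna, exons):
    """Join the exon segments of a transcript, discarding the introns.

    `exons` is a list of (start, end) half-open index pairs in 5'->3' order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


def skipping_variants(pre_mrna, exons):
    """Yield every sequence obtained by skipping a subset of internal exons.

    Assumption for this sketch: the first and last exons are always kept
    and the exon order is never changed.
    """
    first, *internal, last = exons
    for k in range(len(internal) + 1):
        for kept in combinations(internal, k):
            yield splice(pre_mrna, [first, *kept, last])


# Hypothetical transcript: uppercase = exon, lowercase = intron.
pre_mrna = "AUGGCC" + "guaaguuuag" + "GAUUAC" + "guccccag" + "AAAUGA"
exons = [(0, 6), (16, 22), (30, 36)]

mature = splice(pre_mrna, exons)  # all three exons joined
variants = sorted(set(skipping_variants(pre_mrna, exons)))

print(mature)
print(variants)
```

With n internal exons this scheme yields up to 2^n distinct mature messages, which gives a sense of how a gene split into a dozen or more segments could give rise to many variant proteins.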

Although the DNA-to-protein processing apparatus involves much complicated

molecular machinery, some RNA sequences can splice themselves. This autosplicing

capability enables exon shuffling to take place, suggesting the combinatorial

assembly of exons qua irreducible codewords as the basis of primitive, evolving life.

Organisms other than prokaryotes vary enormously in the proportion of their

genome that is not genes. The intergenomic material may exceed by more than an

order of magnitude the quantity of coding DNA. Some of the intergenomic material

is specially named, notably repetitive DNA. The main classes are the short (a

few hundred nucleotides) interspersed elements (SINEs), the long (a few thousand

nucleotides) interspersed elements (LINEs), and the tandem (i.e., contiguous) repeats

(minisatellites and microsatellites,²¹ variable-number tandem repeats (VNTR), etc.).²²

These features can be highly specific for individual organisms. Several diseases are

associated with abnormalities in the pattern of repeats; for example, patients suffering

from fragile X syndrome have hundreds or thousands of repeated CGG triplets at a

locus (i.e., place on the genome) where healthy individuals have about 30. The rôle

of repetition in DNA is still rather mysterious. One can amuse oneself by creating

sentences such as “can a perch perch?” or “will the wind wind round the tower?”

or “this exon’s exon was mistranslated”²³ to show that repetition is not necessarily

nonsense. The genome of the fruit fly Drosophila virilis has millions of repeats of

three satellites, ACAAACT, ATAAACT, and ACAAATT (reading from the 5′ to the

3′ end), amounting to about 10⁸ base pairs (i.e., comparable in length to the entire

20 The exome is the complete set of exons of an organism’s genome.

21 So called because their abnormal base composition, usually greatly enriched in C–G pairs (CpG),

results in satellite bands appearing near the main DNA bands when DNA is separated on a CsCl

density gradient.

22 Archaeal and bacterial genomes contain clustered regularly interspaced short palindromic repeats

(CRISPR; see, e.g., Sander and Joung (2014)). They have found technological application as a way

of genome editing.

23 Most English dictionaries give only one meaning for exon, namely one of four officers acting as

commanders of the Yeomen of the Guard of the Tower of London.
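
The tandem-repeat counts quoted in the main text (about 30 CGG triplets in healthy individuals, hundreds or thousands in patients) can be made concrete with a small sketch that measures the longest run of back-to-back copies of a repeat unit. The flanking sequences, the figure of 250 copies, and the function name `longest_tandem_run` are hypothetical choices for illustration only. The last lines estimate how many 7-nucleotide satellite copies would be needed to fill about 10⁸ base pairs.

```python
import re


def longest_tandem_run(sequence, unit):
    """Return the largest number of contiguous copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Toy loci: the flanking bases and the repeat counts are made up.
typical_locus = "TTAG" + "CGG" * 30 + "CTGA"
expanded_locus = "TTAG" + "CGG" * 250 + "CTGA"

typical_count = longest_tandem_run(typical_locus, "CGG")
expanded_count = longest_tandem_run(expanded_locus, "CGG")
print(typical_count, expanded_count)

# Rough arithmetic for the Drosophila virilis satellites: number of copies
# of a 7-nucleotide unit needed to span about 10**8 base pairs.
copies_needed = 10**8 // len("ACAAACT")
print(copies_needed)
```

The estimate comes out at roughly fourteen million copies, consistent with the "millions of repeats" stated in the text.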